Genome-wide screening identifies Polycomb repressive complex 1.3 as an essential regulator of human naïve pluripotent cell reprogramming

Uncovering the mechanisms that establish naïve pluripotency in humans is crucial for the future applications of pluripotent stem cells including the production of human blastoids. However, the regulatory pathways that control the establishment of naïve pluripotency by reprogramming are largely unknown. Here, we use genome-wide screening to identify essential regulators as well as major impediments of human primed to naïve pluripotent stem cell reprogramming. We discover that factors essential for cell state change do not typically undergo changes at the level of gene expression but rather are repurposed with new functions. Mechanistically, we establish that the variant Polycomb complex PRC1.3 and PRDM14 jointly repress developmental and gene regulatory factors to ensure naïve cell reprogramming. In addition, small-molecule inhibitors of reprogramming impediments improve naïve cell reprogramming beyond current methods. Collectively, this work defines the principles controlling the establishment of human naïve pluripotency and also provides new insights into mechanisms that destabilize and reconfigure cell identity during cell state transitions.


INTRODUCTION
Pluripotency, the ability of individual cells to give rise to all the tissue lineages of a mature organism, is a fundamental process that we have yet to understand fully. In human development, pluripotency emerges in the unspecialized epiblast cells of preimplantation embryos and lasts for 2 weeks until the postimplantation embryo gastrulates and lineages are specified (1). During this period, pluripotent cells isolated from embryos give rise to unspecialized human pluripotent stem cells (PSCs) in the culture dish that retain their developmental potential and characteristics (2-5). Pluripotency is also acquired when somatic cells are reprogrammed to become induced PSCs (iPSCs), resulting in a cell type that is largely indistinguishable from embryo-derived PSCs (6,7).
PSCs exist in two main states that are termed naïve and primed (8,9). Both cell states can self-renew and undergo multilineage differentiation but are functionally and molecularly distinct. Naïve PSCs largely recapitulate the transcriptome, epigenome, and differentiation potential of preimplantation embryos, and primed PSCs are similar to early postimplantation embryos (10-16). This developmental identity uniquely endows naïve PSCs with sought-after properties, including the ability to generate extraembryonic cells and entire blastocyst-like structures (17-24) and to provide a model to study early developmental events, such as X chromosome inactivation (11,12). Naïve PSCs can be obtained directly from human preimplantation embryos, but more commonly, these cells are generated by reprogramming primed PSCs to a naïve state by exposing them to conditions that induce their cell state conversion (25-31). However, as with most reprogramming systems, the efficiency of reprogramming to a naïve state is low and produces a high level of cell heterogeneity (28,29,32). Furthermore, the vast majority of induced somatic cell reprogramming experiments generate primed PSCs and not naïve PSCs; as a result, we know very little about the reprogramming of human cells to naïve pluripotency. Recent studies have described transcriptional changes that occur in cells during naïve reprogramming (29,32); however, we do not know which factors and pathways are involved or are required for this process. Thus, there is a fundamental gap in understanding the mechanisms that control the entry of human cells into naïve pluripotency, thereby hindering our knowledge of early human development, limiting improvements to reprogramming protocols, and preventing the full potential of these cells from being achieved.

Defining the essential regulators of human naïve cell reprogramming
We set out to define the genes that regulate the reprogramming of primed PSCs into a naïve state using a genome-wide CRISPR-Cas9-based screen. We first integrated the Cas9 coding sequence under the control of a CAG promoter into the safe harbor AAVS1 locus in primed PSCs and confirmed that this modified cell line could reprogram to a naïve state with the expected proportion of naïve cells within the population (fig. S1, A and B). Cas9-expressing primed PSCs were then transduced with an optimized human v3 CRISPR-based loss-of-function mutant library at a multiplicity of infection (MOI) of 0.3 and a library representation of >100 cells infected per single-guide RNA (sgRNA). The library consisted of 112,522 sgRNAs targeting 18,365 genes and 1004 gRNAs targeting negative control regions (data S1). The sgRNA plasmids also contained puromycin resistance and blue fluorescent protein (BFP) markers (fig. S1A). After 3 days of puromycin selection, >95% of cells expressed sgRNA plasmids, as revealed by flow cytometry analysis of BFP signal (fig. S1C). Transduced cells were reprogrammed using 5i/L/A conditions (26), and on day 10, a panel of cell surface markers (32) was used to flow-sort two populations: (i) nascent naïve PSCs (i.e., successfully reprogrammed) and (ii) refractory cells negative for naïve markers (i.e., not reprogrammed cells) (Fig. 1A and fig. S2A). Prior colony-forming assays and molecular characterization showed that the reprogrammed population at this stage, although comprising <5% of the cell population, contained all of the cells capable of generating naïve cultures (fig. S2, B and C) (32). Genomic DNA was extracted from the two flow-sorted cell populations, and the abundance of each gRNA was measured by high-throughput sequencing.
RNA sequencing (RNA-seq) libraries were also prepared from the same samples, and as expected, the transcriptional profiles of the two cell populations showed a strong correlation with previous naïve cell reprogramming experiments (R > 0.95; fig. S2D) (32).
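The library representation described above can be checked with a short calculation. The sketch below is illustrative only: it assumes that the number of viral integrations per cell follows a Poisson distribution with a mean equal to the MOI (a common modeling assumption that is not stated in the text) and that the 1004 control gRNAs are counted in addition to the 112,522 gene-targeting sgRNAs. The function names are hypothetical.

```python
import math

# Library size from the text; treating the control gRNAs as additional is an assumption.
N_TARGETING = 112_522
N_CONTROL = 1_004
MOI = 0.3
MIN_CELLS_PER_GUIDE = 100  # the text reports >100 infected cells per sgRNA


def fraction_infected(moi: float) -> float:
    """Fraction of cells with at least one integration (Poisson assumption)."""
    return 1.0 - math.exp(-moi)


def fraction_single_among_infected(moi: float) -> float:
    """Among infected cells, the fraction carrying exactly one integration."""
    return moi * math.exp(-moi) / fraction_infected(moi)


def cells_to_transduce(n_guides: int, coverage: int, moi: float) -> int:
    """Minimum number of cells to expose to virus to reach the target coverage."""
    return math.ceil(n_guides * coverage / fraction_infected(moi))


n_guides = N_TARGETING + N_CONTROL
infected_needed = n_guides * MIN_CELLS_PER_GUIDE
total_cells = cells_to_transduce(n_guides, MIN_CELLS_PER_GUIDE, MOI)

print(f"guides in library:        {n_guides:,}")
print(f"infected cells needed:    {infected_needed:,}")
print(f"fraction infected:        {fraction_infected(MOI):.3f}")
print(f"single-guide fraction:    {fraction_single_among_infected(MOI):.3f}")
print(f"cells to transduce (min): {total_cells:,}")
```

Under these assumptions, roughly a quarter of the cells are infected at an MOI of 0.3 and most infected cells carry a single sgRNA, which is the usual rationale for screening at a low MOI.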
Comparing gRNA counts between the two isolated cell populations using the MAGeCK algorithm (33), which takes into account the multiple sgRNAs per gene, identified gRNAs that target 446 genes (2.4% of all genes targeted) that were significantly underrepresented in the nascent naïve cell population; these genes are therefore essential for reprogramming (P < 0.02, permutation test; Fig. 1, B and C, and data S2). Conversely, we identified a similar number of genes (n = 540; 2.9%) whose gRNA counts were significantly overrepresented in the nascent naïve PSC population, which correspond to genes that impede reprogramming and whose targeted deletion led to enhanced reprogramming (P < 0.02, permutation test; Fig. 1, B and C; and fig. S3, A and B; and data S2). Examining the distribution across cellular compartments revealed that, compared to the background set of genes, essential and impediment genes encode proteins that are strongly enriched for factors localized in the nucleus (P < 5 × 10⁻⁶, Fisher's exact test; fig. S3, C and D). Only a minority of the essential and impediment genes identified in the screen changed their expression levels between primed PSCs and nascent naïve cells (2.3%; fig. S3, E to G) or between reprogrammed and refractory cells (2.2%; Fig. 1D). This finding shows that the induction of new genes is not typically required for primed to naïve reprogramming but instead that existing factors are repurposed to induce a cell state change.
Fig. 1 (caption, partial). Red, gRNAs that target impediment genes; blue, gRNAs that target essential genes; purple, negative control gRNAs; gray, all other sgRNAs. (C) Ranked differential enrichment (DE) plots by comparing nascent naïve with refractory populations. Numbers of identified impediment and essential genes are shown (applying a cutoff of P < 0.02, permutation test) together with the percent out of all genes targeted. (D) MA plot shows that the majority of essential and impediment genes are expressed at similar levels in nascent naïve and refractory cells. Transcriptional data are shown for each gene (represented by a dot), comparing nascent naïve and refractory cell populations; dark gray, differentially expressed; light gray, all other genes. Essential and impediment genes are colored in blue and red, respectively.
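The comparison of gRNA counts between the two sorted populations can be illustrated with a deliberately simplified sketch. It is not a reimplementation of MAGeCK (33) and does not reproduce its statistical model or its permutation-based P values; it only shows the underlying idea of normalizing counts and summarizing several guides per gene. All guide names, gene names, and counts are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def counts_per_million(counts):
    """Scale raw guide counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {guide: c * 1e6 / total for guide, c in counts.items()}


def gene_log2_fold_changes(naive, refractory, guide_to_gene, pseudocount=1.0):
    """Median log2(naive / refractory) across the guides of each gene."""
    naive_cpm = counts_per_million(naive)
    refractory_cpm = counts_per_million(refractory)
    per_gene = defaultdict(list)
    for guide, gene in guide_to_gene.items():
        ratio = (naive_cpm[guide] + pseudocount) / (refractory_cpm[guide] + pseudocount)
        per_gene[gene].append(math.log2(ratio))
    return {gene: median(values) for gene, values in per_gene.items()}


# Hypothetical toy counts (not data from the screen).
guide_to_gene = {
    "A_1": "GENE_A", "A_2": "GENE_A",
    "B_1": "GENE_B", "B_2": "GENE_B",
    "C_1": "CONTROL", "C_2": "CONTROL",
}
naive_counts = {"A_1": 10, "A_2": 20, "B_1": 800, "B_2": 770, "C_1": 200, "C_2": 200}
refractory_counts = {"A_1": 700, "A_2": 680, "B_1": 100, "B_2": 120, "C_1": 200, "C_2": 200}

lfc = gene_log2_fold_changes(naive_counts, refractory_counts, guide_to_gene)
for gene, value in sorted(lfc.items(), key=lambda item: item[1]):
    print(f"{gene}: {value:+.2f}")
```

In this toy example, a negative value (guides underrepresented in the nascent naïve population) corresponds to a candidate essential gene, and a positive value corresponds to a candidate impediment.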

Essential regulators of naïve cell reprogramming are distinct from genes required for primed cell reprogramming or proliferation
We next examined the genes that were identified as being essential for primed to naïve PSC reprogramming. There was only a small overlap (~2%) between our list of essential genes and the genes that are required to reprogram human fibroblasts into primed-state induced pluripotent stem cells (iPSCs) (Fig. 2A) (34). Genes that are common to both studies include SALL1, ZNF32, RAD21, and POU5F1 (also known as OCT4). Examining the differences, we found that many of the genes that are essential for fibroblast to primed iPSC reprogramming are associated with pathways that mediate cell adhesion and mesenchymal-to-epithelial transition, which are processes that are less relevant in primed to naïve PSC reprogramming because primed pluripotent cells are epithelial. Overall, these results suggest that the factors required for entry into the naïve pluripotent state are largely different from those required for converting somatic cells to primed pluripotency.
In addition, to stringently determine the effects of essential genes on naïve reprogramming rather than on primed PSCs, we integrated the results from a second genetic screen in the same primed PSC line, which identified genes required to maintain the proliferation of primed PSCs. We identified 64 essential genes in our reprogramming screen that overlapped with the list of genes that are required for primed PSCs (P < 0.02 in primed PSC screen, permutation test; Fig. 2B). Because these common genes might have roles in sustaining pluripotency in multiple states, we prioritized in follow-up experiments the 382 genes that are implicated in naïve reprogramming (data S2). Together, these findings have identified the factors that regulate naïve PSC reprogramming and establish that essential regulators of naïve cell reprogramming are mostly distinct from those genes that are required for primed PSC reprogramming or proliferation.
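The filtering step described above amounts to a set difference. The sketch below uses placeholder gene identifiers, sized to match the numbers reported in the text (446 essential genes, 64 shared with the proliferation screen); the size of the proliferation gene set is an arbitrary assumption made only so that the example runs.

```python
def prioritize(reprogramming_essential, proliferation_essential):
    """Split reprogramming hits into shared and reprogramming-specific genes."""
    shared = reprogramming_essential & proliferation_essential
    specific = reprogramming_essential - proliferation_essential
    return shared, specific


# Placeholder identifiers, not real gene names.
reprogramming_hits = {f"GENE_{i}" for i in range(446)}
# Arbitrary set of 300 genes constructed to overlap the hits by 64 genes.
proliferation_hits = {f"GENE_{i}" for i in range(382, 682)}

shared, specific = prioritize(reprogramming_hits, proliferation_hits)
print(f"shared with primed PSC proliferation: {len(shared)}")
print(f"prioritized for follow-up:            {len(specific)}")
```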

Signaling, transcription, and chromatin regulators including Polycomb repressive complex 1.3
The genes essential for naïve cell reprogramming are strongly enriched for functions associated with transcriptional regulators and in chromatin modifying pathways (Fig. 2C and fig. S3, H to J). Visualizing the essential factors based on their known interactions revealed that many, sometimes all, proteins within a particular complex were identified, thereby implicating not only individual genes but also whole complexes that are required for naïve PSC reprogramming (fig. S4A). Most of the identified genes and complexes have no prior connections to naïve reprogramming or human pluripotency.
Fig. 2 (caption, partial). (A and B) Venn diagrams show the overlap between genes that are essential for naïve PSC reprogramming and genes that are essential for (A) fibroblast to primed iPSC reprogramming (34) and (B) primed PSC proliferation. (C) Charts show the adjusted P value for essential genes in the biological processes gene ontology (GO) category (Fisher's exact test). (D) Ranked DE plot by comparing nascent naïve with refractory populations. Essential genes are highlighted. The top 15 ranked essential genes are shown; SAGA components, purple; PRC1.3 components, green. (E and F) Charts show the P values of (E) PRC1 members and (F) SAGA complex members, as a measure of their depletion in the nascent naïve cell population following the CRISPR-Cas9 screen. The red line indicates P = 0.02 as a cutoff (permutation test). Schematics of the complexes are also shown; bold lines indicate components classified as an essential gene.
In particular, the top ranked hits included multiple members of the Polycomb repressive complex 1.3 (PRC1.3) and the SAGA (Spt-Ada-Gcn5 acetyltransferase) complex (Fig. 2, D to F). PRC1 complexes typically repress transcription, acting through the mono-ubiquitylation of histone H2A lysine 119 (H2AK119ub1), the recruitment of PRC2, and chromatin compaction (35). PRC1 consists of the ubiquitin ligases RING1A/B and one of six homologs of the Polycomb group RING finger (PCGF1 to PCGF6) proteins that define distinct subunit assemblies. Canonical PRC1 contains a Chromobox (CBX) protein that binds H3K27me3, whereas variant PRC1 lacks CBX and contains RING1 and YY1 binding protein (RYBP)/YY1 associated factor (YAF) and subunit-specific auxiliary factors. In our screen for factors essential for naïve PSC reprogramming, we identified RING1B, PCGF3, FBRS, RYBP, and several CK2 genes within the list of significant hits, and these genes comprise nearly all of the known components of PRC1.3 (Fig. 2E). PCGF3's close homolog, PCGF5 (a component of PRC1.5), had only a minor impact on naïve PSC reprogramming in our screen, presumably because PCGF5 expression is low in PSCs. The identification of PRC1.3 was specific: Out of the other PRC1 subtype components, only PCGF1 and PCGF2 were also essentially required for naïve PSC reprogramming; however, no other PRC1.1-, PRC1.2-, or PRC1.4-specific components were within the set of essential genes (Fig. 2E).
Another prominent complex identified in the screen as being required for naïve reprogramming was the multiprotein SAGA transcriptional coactivator complex. Of 18 core SAGA components, 7 ranked within the top 15 hits in our screen (Fig. 2D) and 11 were within the list of genes essential for reprogramming (Fig. 2F). Most of these genes are not required for generating nonreprogrammed, refractory cells (data S1) or for primed PSC proliferation, indicating that SAGA function is needed specifically for naïve reprogramming. None of the SAGA components significantly changed their expression levels during reprogramming (fig. S4B). SAGA functions through multiple catalytic activities, including histone acetyltransferase (HAT) and deubiquitylation (DUB) (36). Our screen results point to an essential role for the HAT GCN5 (KAT2A) and other members of the HAT module, with a less-prominent role for the DUB module, with only one of the four components required (Fig. 2F). Other large regulatory complexes with components in the cohort of essential factors included the Mediator complex (fig. S4C). Additional regulators with core cellular functions were also identified (fig. S4A), which, as expected, were also required for primed PSC proliferation (Fig. 2B).
Our genetic screen also identified signaling components and transcription factors that were essential for naïve PSC reprogramming (fig. S4, D and E). One example is the set of components involved in the assembly of the WNT/β-catenin-T cell factor (TCF) complex (q = 0.01, Fisher's exact test with Benjamini-Hochberg correction; Fig. 2C and fig. S4E). All components are highly expressed in the pluripotent epiblast cells of the preimplantation human embryo, raising the possibility that similar signaling events could also be present during embryo development (fig. S4F). The identified essential factors collectively inhibit WNT signaling, either by reducing β-catenin levels or by inhibiting the transcriptional activation downstream of β-catenin and TCF family members. This is consistent with small-molecule inhibition of WNT/β-catenin signaling stabilizing naïve pluripotency (30,31,37,38) and indicates that WNT pathway inhibition exerts a strong effect on reprogramming even under conditions containing a glycogen synthase kinase 3 (GSK3) inhibitor. The results of our screen show that many of these factors are also required genetically during naïve PSC reprogramming and, importantly, identify the key regulators within this signaling pathway that are central to the stabilization of naïve pluripotency. Together, our genome-wide discovery screen has revealed that disrupting pathways associated with chromatin acetylation and ubiquitination, WNT signaling inhibition, and transcription factors is detrimental for naïve PSC reprogramming.
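The q value quoted above comes from a Benjamini-Hochberg correction of enrichment P values. As a minimal sketch of how such an adjustment works (the input P values are made up and the function name is hypothetical; this is not the authors' analysis code), each sorted P value is scaled by the number of tests divided by its rank and then made monotone:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (q values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest to enforce monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical P values for four gene sets.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
for p, q in zip(example_p, example_q):
    print(f"P = {p:.3f} -> q = {q:.3f}")
```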

PRC1.3 is required for naïve PSC reprogramming
PRC1.3 has not been investigated in human pluripotency or reprogramming and, in general, is understudied compared to other Polycomb complexes. Given the unexpected link between the known functions of PRC1.3 and the strong phenotype in the genetic screen, we decided to further investigate mechanisms of PRC1.3 in naïve PSC reprogramming. Furthermore, because PRC1.3 components are expressed in pluripotent cells of the human preimplantation embryo (fig. S5A), understanding PRC1.3 function is also relevant for how cells establish naïve pluripotency during early development. We used CRISPR-Cas9 to delete the core factor PCGF3 (ranked no. 2 of 18,365 genes) in primed PSCs, which caused the dissociation of PRC1.3 (fig. S5, B to E). PCGF3 knockout (KO) cells grew normally under primed PSC conditions with unaltered proliferation rates, and they remained undifferentiated and maintained a standard transcriptional program and expression of marker genes (fig. S5, F to K). The absence of a strong phenotype in self-renewing primed PSCs is consistent with the results of our second CRISPR-Cas9 screen, which indicated that PCGF3 was not required for primed PSC proliferation.
We next initiated primed to naïve PSC reprogramming using 5i/L/A conditions in two PCGF3 KO cell lines and parental wild-type (WT) cells. WT cells produced characteristic naïve PSC colonies after 10 days of reprogramming, whereas PCGF3 KO PSCs produced flattened, dispersed colonies that resembled neither primed nor naïve PSCs (Fig. 3A). Corroborating these morphological differences, immunofluorescence microscopy confirmed that very few colonies in the KO cultures expressed the naïve PSC marker KLF17, and a greater proportion of the colonies were differentiated (Fig. 3, B and C). Flow cytometry analysis of multiple pluripotent state cell surface markers (32, 39-41) showed that the proportion of reprogrammed naïve PSCs in the cell population was substantially lower in the KO cells (<1%) compared to WT cells (~19% CD75+/CD24−; Fig. 3D; ~29% CD75+/SUSD2+; fig. S5L). The KO cells were not simply delayed in reprogramming because we observed the same phenotype after 24 days of reprogramming (fig. S5M), and after 60 days of continuous culture, there were no cells remaining in the KO cultures.
Consistent with this notable phenotype, RNA-seq analysis of bulk populations showed that the transcriptional profiles of WT and PCGF3 KO cells differed substantially at day 10 of reprogramming (Fig. 3E). Differential gene expression analysis revealed that naïve pluripotency marker genes were up-regulated during reprogramming in WT cells but not in PCGF3 KO cells (Fig. 3, F and G, and fig. S5N). Furthermore, decreased levels of pan-pluripotency markers and induction of differentiation genes indicate that the KO cells have exited the pluripotent state under these conditions (Fig. 3, F and G). To determine whether the reprogramming phenotype extended to alternative methods of naïve PSC reprogramming, we tested our cell lines under chemical resetting (CR) conditions (30). Under these conditions, PCGF3 KO cells were also severely compromised in their ability to reprogram into a naïve state compared to WT cells (Fig. 3H and fig. S5, O and P). As additional controls, we initiated reprogramming with PCGF3 heterozygous cells and with PCGF3 KO cells that expressed a PCGF3 transgene (fig. S5Q). These PSC lines had a much higher proportion of nascent naïve cells in the population compared to the PCGF3 KO PSCs, indicating that reprogramming ability was restored (Fig. 3H and fig. S5, O and P). Last, deletion of an alternative PRC1.3 component, FBRS, also resulted in cells that generated a strongly reduced proportion of reprogrammed cells compared to WT cells analyzed at the same time point, as predicted by our screen (fig. S6, A to E). Together, these results establish a critical new role for PRC1.3 in reprogramming human cells into a naïve state.

Misregulation of PRC1.3 target genes is associated with a failure of naïve reprogramming
To investigate the role of PRC1.3 in reprogramming, we used chromatin immunoprecipitation sequencing (ChIP-seq) in naïve and primed PSCs to identify a stringent set of genes that are co-occupied [...] The majority of genes (>80%) that retained or gained PRC1.3 occupancy during primed to naïve reprogramming were downregulated during reprogramming (Fig. 4C and fig. S7C). RNA-seq of reprogramming cells revealed that many PRC1.3 target genes were aberrantly expressed in PCGF3 KO cells, whereas these genes were repressed in WT cells (Fig. 4, D and E, and fig. S7D). Furthermore, cell sorting and RNA-seq revealed that PRC1.3 target genes were transcriptionally down-regulated in nascent naïve cells compared to primed PSCs but were not down-regulated in refractory cells that failed to reprogram (Fig. 4F). Together, these results show that PRC1.3 targets a cohort of developmental and signaling factors for transcriptional repression and that the inability to silence these genes is associated with the failure to reprogram cells into a naïve state.

Pluripotent state-specific PRC1.3 composition
The expression levels of most PRC1.3 complex components do not change during naïve reprogramming (Fig. 5A). We therefore sought to determine whether the composition of PRC1.3 differs between naïve and primed PSCs. We used quantitative, multiplexed rapid immunoprecipitation mass spectrometry (MS) of endogenous protein (qPLEX-RIME) (42,43) to identify proteins that interact on chromatin with PCGF3, which is a component that is specific to PRC1.3 (fig. S7E and data S3) (44,45). Known PRC1.3 complex proteins were identified in primed PSCs, and those components were much less abundant in primed PSCs that lack PCGF3 and PCGF5, thereby confirming the specificity of the assay (fig. S7F). Comparing PCGF3-associated proteins between naïve and primed PSCs showed that the relative abundance of most PRC1.3 components was similar between the two cell types (Fig. 5, B and C). Unexpectedly, however, the abundance of two PRC1.3 paralog proteins, Fibrosin (FBRS) and Activator of transcription and developmental regulator (AUTS2), differed, whereby AUTS2 interacted with PCGF3 in primed but less in naïve PSCs and vice versa for FBRS (Fig. 5, B and C). We initially hypothesized that the paralog switch occurs in response to the signaling inhibitors within the reprogramming cocktail. However, analysis of PCGF3-interacting proteins in primed PSCs that were cultured in 5i/L/A for 48 hours showed that AUTS2 was still the more abundant PRC1.3 paralog (Fig. 5C). Instead, this switch is likely to be driven by paralog availability, as FBRS is more abundant in naïve PSCs and then transitions to AUTS2 as the dominant paralog in primed PSCs [...] FBRS or AUTS2, respectively, with few cells (<2%) coexpressing both paralogs (Fig. 5E). By day 10 of reprogramming, AUTS2 transcript levels were strongly reduced in the nascent naïve cells but remained highly expressed in refractory cells that failed to reprogram (Fig. 5F).
Conversely, FBRS is moderately up-regulated both in nascent naïve PSCs and in refractory cells (Fig. 5F). This switch in paralog expression is consistent with the results of our prior screen, which identified an essential role in naïve reprogramming for FBRS (ranked no. 10), but AUTS2 was not required (ranked no. 14,751) (Fig. 2E).
Transcriptional analysis of naïve to primed PSC transition revealed a changeover from FBRS to AUTS2 expression (Fig. 5G). FBRS and AUTS2 also showed anticorrelated expression patterns during human embryo development; FBRS is more highly expressed in early epiblast and then switches over to AUTS2 as the more abundant paralog in postimplantation epiblast cells (Fig. 5H). Similar expression patterns are also observed in cynomolgus monkey development (fig. S7G) [...] (Fig. 5I). Thus, the FBRS to AUTS2 paralog switch occurs in epiblast cells during embryo implantation and is recapitulated in naïve to primed PSC transitions. Curiously, several AUTS2 transcriptional start sites are bound by PRC1.3 in naïve but not in primed PSCs (fig. S7H), which is consistent with the low levels of AUTS2 in naïve PSCs (Fig. 5, D and G). This suggests that the two paralogs could be subject to self-regulation involving PRC1.3 itself. Together, these results establish that, although the composition of PRC1.3 is largely retained between naïve and primed PSCs, there is a paralog switch between FBRS and AUTS2 that occurs upon pluripotent state transitions and also during the implantation phase of human development.

PRDM14 interacts with PRC1.3 to ensure naïve cell reprogramming and gene regulation
We next investigated the regulation of PRC1.3 in human pluripotent states. The qPLEX-RIME data revealed differences in PRC1.3 interactions on chromatin between the two cell types (Fig. 6A and data S3). This included naïve-enriched proteins, such as the linker DNA binding histone protein HISTH1.1, the DNA methyltransferase regulator DNMT3L, and the transcription factor PR/SET domain 14 (PRDM14) (Fig. 6A). PRDM14 is a central regulator of pluripotency (46,47) and was of particular interest because, similar to PRC1.3, it was also essential for naïve PSC reprogramming (ranked no. 74) (Fig. 6A and fig. S4D). We confirmed the interaction between PRC1.3 and PRDM14 in naïve PSCs by coimmunoprecipitation (Co-IP) (fig. S8A). This association was further supported by motif analysis of PRC1.3 peaks in naïve PSCs, which revealed that the PRDM14 motif was the most highly enriched motif at PRC1.3 peaks in naïve PSCs (Fig. 6B), whereas the PRDM14 motif was not enriched at PRC1.3 peaks in primed PSCs or in regions matched for GC content. Moreover, in naïve PSCs, the proportion of PRC1.3 peaks containing a PRDM14 motif was significantly higher compared to RING1B sites that lacked PCGF3 occupancy (53% versus 6%; P < 0.0001, two-sided Fisher's exact test) and to regions matched for GC content (53% versus 5%; P < 0.001, two-sided Fisher's exact test). Analysis of ChIP-seq data (47) showed that PRDM14 was bound at the majority (72%) of PRC1.3 peaks in PSCs cultured in 4i naïve medium, and the ChIP signal was reduced to background levels following the induced degradation of PRDM14-AID-VENUS (Fig. 6, C and D) (47). PRDM14 occupancy was significantly higher at PRC1.3 peaks compared to RING1B-only bound sites (Fig. 6E and fig. S8B), which further supports a specific association between PRDM14 and PRC1.3. Furthermore, PRC1.3 target genes were transcriptionally derepressed following the acute depletion of PRDM14 in naïve PSCs (Fig. 6, D and F, and fig. S8, C and D). Last, we initiated naïve cell reprogramming of primed PRDM14-AID-VENUS PSCs (fig. S8, E and F) (47). The ability to reprogram into the naïve state was low for this cell line even in the presence of PRDM14-AID-VENUS (Fig. 6G), potentially because of impaired function of the fusion protein. Nevertheless, no nascent naïve PSCs were obtained when PRDM14 was degraded during reprogramming, which demonstrates that PRDM14 is required for naïve cell reprogramming (Fig. 6G). Together, these results establish that PRDM14 and PRC1.3 function together to control target gene repression and to ensure the reprogramming of cells into naïve pluripotency.
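The motif comparison above relies on a two-sided Fisher's exact test on a 2 × 2 table (peaks with or without a PRDM14 motif, in two classes of sites). The sketch below implements the test with the Python standard library only. Because the text reports percentages rather than peak counts, the counts used here (53 of 100 versus 6 of 100) are hypothetical stand-ins chosen to match those percentages, and the resulting P value is therefore not the one from the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)  # tolerance for floating-point ties
    )
    return min(1.0, total)


# Hypothetical counts: motif present / absent in two classes of sites.
p_motif = fisher_exact_two_sided(53, 47, 6, 94)
print(f"two-sided P = {p_motif:.3g}")
```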

Overcoming reprogramming impediments to improve naïve PSC reprogramming
We next investigated the genes that impede naïve PSC reprogramming that were identified in our CRISPR-Cas9 screen (Fig. 7A). Our genetic screen identified HDAC2 as a strong impediment of naïve PSC reprogramming (ranked no. 40 of 18,365 genes in the impediment ranking); cells targeted with HDAC2 gRNAs were 100-fold enriched in the successfully reprogrammed cell population compared to refractory cells (Fig. 7B). The deletion of the other histone deacetylase (HDAC)-encoding genes had little impact on naïve PSC reprogramming (Fig. 7B) despite being expressed at similar levels (Fig. 7C). HDAC2 is a component of three distinct complexes, Swi-independent 3 (SIN3), Nucleosome remodelling and deacetylase (NuRD), and Cofactors of repressor element-1 silencing transcription factor (CoREST). Results from our CRISPR screen show that most members of the SIN3 and NuRD complexes can be deleted without affecting naïve PSC reprogramming (Fig. 7D). In contrast, members of the CoREST complex, RCOR1 (ranked no. 42 in the list of impediments; P = 0.0002, permutation test) and KDM1A (ranked no. 815 in the list of impediments; P = 0.03, permutation test), are promising candidates that impede naïve PSC reprogramming and whose targeted deletion led to enhanced reprogramming (Fig. 7D). The discovery of HDAC2 as a major reprogramming impediment is important because the treatment of cells with pan-HDAC inhibitors increases reprogramming and transdifferentiation efficiency in many contexts, including toward naïve PSCs, although it is unknown which HDACs elicit this effect (30, 48-50). We therefore tested whether the targeted inhibition of HDAC2 could facilitate naïve PSC reprogramming. We first examined whether the broad-spectrum HDAC inhibitor valproic acid (VPA), a component of CR medium, could be replaced with selective HDAC2 inhibitors BRD4884 and BRD6688 (51).
Both HDAC2 inhibitors generated nascent naïve cells with high efficiencies, and the HDAC2 inhibitors were effective at 100-fold lower concentrations than VPA (Fig. 7, E and F). We additionally found that HDAC2 inhibitors were also effective under alternative reprogramming conditions. We supplemented 5i/L/A medium with BRD4884, BRD6688, or VPA at equal concentrations for the first 3 days of reprogramming. At day 10 of reprogramming, the proportion of nascent naïve PSCs in the cell population following HDAC2 inhibition was two- to threefold higher than with VPA (Fig. 7G). The naïve cells generated by HDAC2 inhibitor treatment were propagated and formed stable cell lines (Fig. 7, H and I). These results establish that supplementation with HDAC2 inhibitors is of strong practical benefit and can improve on current reprogramming methods.

DISCUSSION
Here, we identify a comprehensive set of regulators that control the establishment of naïve pluripotency in human cells. Our study has uncovered crucial complexes and pathways that are essential for primed to naïve cell reprogramming and, additionally, those that create barriers to impede this process. The results describe the first genome-wide CRISPR-Cas9-based functional screen in human cell reprogramming and provide an important new dataset that can be mined by the scientific community to understand the processes controlling human pluripotent cell identity.
Our screen led us to identify an essential new role for the noncanonical Polycomb repressive complex PRC1.3 in naïve PSC reprogramming. Expanding upon this finding, we dissected the mechanism whereby PRC1.3 and PRDM14 transcriptionally repress a set of developmental, chromatin, and signaling regulators and found that the failure to silence these factors is associated with a failure to reprogram. It is likely that not all PRC1.3 target genes have detrimental effects, but having identified this gene set, these candidates can be examined in further studies as potential disruptors of cell reprogramming. PRC1.3 was required for naïve reprogramming under different conditions (5i/L/A and CR). This indicates that each condition could transition cells using similar regulatory pathways, which is consistent with their similar transcriptional trajectories during reprogramming (29). Several of the PRC1.3 target genes are associated with neural differentiation, and these genes had elevated expression in refractory cells and also when PRC1.3 was inactivated.
A previous study showed that most cells that fail to reprogram to naïve pluripotency adopt a neural phenotype (32). PRC1.3, therefore, might serve as a transcriptional repressor to silence alternative lineage fates, and potentially changing the strength of this repression can alter the balance either toward successful naïve cell reprogramming or to refractory cells. This hypothesis is supported by the high ranking of several neural determinants as reprogramming impediments. Together, these results suggest that blocking alternative routes might help channel cells along the correct path. This has important implications for the improved design of reprogramming strategies.
Our study also uncovered new insights into PRC1.3 regulation and function. Unusually for a PRC1 complex, PRC1.3 lacks an auxiliary component that binds directly to DNA sequences or to specific chromatin modifications. Instead, our results lead us to propose that PRDM14, a pluripotency transcription factor, recruits PRC1.3 to specific target sites during cell reprogramming. This proposal is supported by the specific association between PRC1-occupied sites and PRDM14 binding and motif enrichment, together with the interaction between PRC1.3 and PRDM14 in the absence of DNA and also when bound to chromatin. It is currently unclear whether the association between PRC1.3 and PRDM14 is facilitated by the higher levels of PRDM14 in naïve compared to primed PSCs or alternatively by changes to the composition or modifications to PRC1.3 that could potentially enable this interaction. Our findings extend prior work showing that PRDM14 recruits an alternative Polycomb repressive complex, PRC2, to stabilize naïve pluripotency in mouse (52,53). In mouse embryonic stem cells, the DNA binding protein Upstream transcription factor 1 (USF1) could have a similar role as an auxiliary factor for PRC1.3 recruitment (54). In contrast to our study, however, USF1-PRC1.3 was associated with gene activation rather than repression (54); therefore, different recruitment mechanisms and subunit composition may endow PRC1.3 with different functional properties (55). PRC1.3 could also have additional roles that are independent of developmental gene regulatory control including low-level genome-wide coverage to constrain transcriptional activation or associations with noncoding RNAs (54,56,57). Examining these possibilities in the context of pluripotent state transitions is an exciting area for future investigation.
We also uncovered a developmentally controlled switch that occurs between the paralogs FBRS and AUTS2, whereby FBRS-containing PRC1.3 is the dominant complex in naïve PSCs and in preimplantation human embryos. Consistent with this expression profile, FBRS, but not AUTS2, is required for naïve cell reprogramming. Previous studies have shown that the integration of AUTS2 within a related complex, PRC1.5, is associated with transcriptional activation (55,58); however, PRC1.3-AUTS2 in primed PSCs does not seem to be associated with activation. This difference could potentially be due to the expression of different AUTS2 isoforms, which have distinct and incompletely understood roles in transcriptional repression and activation (59). Protein interaction studies have identified FBRS and AUTS2 as components of PRC1.3 (44,45), but it was not known that their association with PRC1.3 could differ or that each version of the complex could have distinct and separable requirements. This finding underscores the remarkable and dynamic variation in the composition of PRC1 complexes to adapt to specific roles. Further examination of PRC1.3 during the transition from naïve to primed PSCs could serve as a paradigm for how cell type-specific variants of chromatin-modifying complexes act to control cell state changes. In particular, an important future challenge remains in understanding whether the switch between FBRS and AUTS2 paralogs endows PRC1.3 with distinct biochemical or functional roles.
Naïve PSCs harbor desirable and unique properties, including a seemingly unrestricted developmental potential encompassing the ability to efficiently generate hypoblast and trophoblast cell types including human blastoids (17–24), in addition to serving as a model for peri-implantation developmental processes, such as the reconfiguration of the epigenome (11,12). Progress in this area has been hampered by an inadequate understanding of the mechanisms that control naïve pluripotency and also restricted by suboptimal growth conditions. We anticipate that the new insights obtained in this study should lead to improved reprogramming methods that are designed to overcome specific impediments with minimal impact on nontarget pathways. As a first step, we demonstrated the feasibility of this approach by identifying HDAC2 as a key reprogramming impediment and showing that replacing a broad-spectrum HDAC inhibitor with a selective HDAC2 inhibitor is of strong practical benefit and can improve on current reprogramming conditions. Advances based on this and other reprogramming impediments identified here should facilitate the increased use and exploitation of naïve PSCs in research and clinical applications.
There are several limitations that are associated with the genome-wide CRISPR-Cas9 screen. First, our screen was designed to compare the sgRNA counts between the nascent naïve cell population and the refractory population, but a blind spot of this design is that gene KOs that are depleted before reprogramming would be missed. It is possible that some of those genes could have roles in cell reprogramming that we would not detect. Second, the screen is unable to identify genes that have cell nonautonomous effects, as any consequence of deleting this type of gene would be masked by most of the cells in the population that retain the gene's function. Third, this type of genetic screen can be affected by clonal or proliferation effects that could lead to changes in sgRNA counts when comparing between cell populations. This effect is a limitation of the study and one that we tried to mitigate by minimizing the reprogramming duration.
Together, we have uncovered a comprehensive set of factors involved in naïve human PSC reprogramming. These findings provide new insights into the principles controlling human pluripotent states and have established a new role for variant Polycomb complexes in pluripotent state transitions. We anticipate that these discoveries will lead to the increased use and exploitation of human naïve PSCs including the advancement of human embryo models.

MATERIALS AND METHODS
Reagent and resource details are provided in data S4.

Genome-wide CRISPR KO screens
Cas9, driven by the CAG promoter, was integrated into the AAVS1 locus in the FSPS13B primed iPSC line to produce a FSPS13B CAG-CAS9 BP primed PSC line. To achieve this, 1 million primed iPSCs were nucleofected with 5 μg of pZFN-AAVS1-ELD (Addgene, #159297), 5 μg of pZFN-AAVS1-KRR (Addgene, #159298), and 2 μg of pAAVS1-CAG-hCAS9-neo (Addgene, #166026) plasmids. At 48 hours posttransfection, positive clones were selected with G418 (250 μg/ml). Clonal lines were expanded, and Cas9 integration was verified using junctional polymerase chain reaction (PCR) (AAVS1-GF4, CTTAGCCACTCTGTGCTGACCACTC and CAGSB-5′P2, CGTAAGTTATGTAACGCGGAACTCC for the left homology arm; bGHpA-U2, ATGCTGGGGATGCGGTGGGCTCT and AAVS1-GR3, CACAGGTGGCGCTTCCAGTGCTCAGACTAG for the right homology arm). Cas9-expressing clonal lines were checked for genome editing activity using our Cas9 reporter system, and we selected the most stable and active cell clone. We also checked the activity before genetic screening to ensure that cells had not undergone silencing during expansion. The resultant cell line, FSPS13B CAGCAS9 B1 primed PSCs, was cultured on Vitronectin-coated plates in complete TeSR-E8 medium. To perform a screen for genes essential for primed PSC proliferation, we transduced 30 million FSPS13B CAGCAS9 B1 primed PSCs by spinfection with the human v1 CRISPR gRNA library (61). Seventy-two hours after transduction, cells were harvested, and a transduction efficiency of ~30% was confirmed. The cells were then selected with puromycin for 3 days and further cultured under feeder-independent conditions (TeSR-E8 on Vitronectin) by replating 50 million cells at each passage. Twenty-one days after transduction, the cells were harvested for genomic DNA extraction. To perform a screen to identify genes involved in primed to naïve reprogramming, we transduced 30 million FSPS13B CAGCAS9 B1 primed PSCs by spinfection with the human v3 CRISPR gRNA library (62) at an MOI of 0.3 (63).
Seventy-two hours after transduction, cells were harvested with Accutase and pooled, and a transduction efficiency of ~30% was confirmed by flow cytometry for BFP expression. The cells were seeded onto 10-cm culture dishes at a density of 65,000 cells/cm² and cultured in TeSR-E8 medium supplemented with puromycin. On day 6 after transduction, puromycin-selected cells were harvested with Accutase and confirmed by flow cytometry to be ~95% BFP positive. The short, 6-day, time window between cell transduction and the start of cell reprogramming was designed to minimize clonal or proliferation effects in the starting cell population. Fifty-one million cells were plated onto 17 culture dishes (15 cm) that were precoated with MEFs, in TeSR-E8 medium supplemented with 20% KSR and 10 μM Y-27632. The following day, cells were rinsed briefly with DMEM/F12 and changed to 5i/L/A medium. Medium was replaced daily. Cells were passaged with Accutase on day 5 of reprogramming and split at a ratio of 1 to 2.5 so that 42 culture dishes (15 cm) were seeded. Medium was replaced daily. On day 10 of reprogramming, cells were harvested with Accutase and pooled. MEFs were depleted by labeling the samples with a biotin-conjugated anti-mouse Cd90.2 antibody (clone 30-H12; BioLegend) and using Streptavidin Microbeads (Miltenyi Biotec) to specifically isolate unlabeled human cells from a MACS LS column (Miltenyi Biotec) on a QuadroMACS magnetic separator (Miltenyi Biotec). The isolated human cells were then incubated in batches with a panel of antibodies against cell surface markers, as described in the flow cytometry methods section, and cell-sorted into two populations, nascent naïve cells and refractory cells, each containing ~7 million cells. Cells were pelleted by gentle centrifugation, and samples were frozen promptly for genomic DNA isolation.
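
The relationship between the MOI of 0.3 and the observed ~30% transduction efficiency can be illustrated with a simple Poisson model. This is a minimal sketch under the assumption that viral integrations per cell are Poisson distributed; that model is a common approximation and is not stated in the methods above, and the function name is hypothetical.

```python
import math


def poisson_transduction(moi: float) -> dict:
    """Expected transduction statistics, ASSUMING Poisson-distributed integrations."""
    p_zero = math.exp(-moi)        # fraction of cells with no integration
    p_one = moi * math.exp(-moi)   # fraction of cells with exactly one integration
    p_any = 1.0 - p_zero           # fraction of cells with at least one integration
    return {
        "fraction_transduced": p_any,
        "fraction_single_among_transduced": p_one / p_any,
    }


stats = poisson_transduction(0.3)
print(f"transduced: {stats['fraction_transduced']:.1%}")
print(f"single integration among transduced: "
      f"{stats['fraction_single_among_transduced']:.1%}")
```

Under this assumption, an MOI of 0.3 gives roughly a quarter of cells transduced, close to the measured ~30%, with most transduced cells expected to carry a single gRNA.
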

gRNA sequencing
Genomic DNA was isolated from the day 21 cell populations for the primed PSC proliferation screen or from the two cell-sorted fractions for the reprogramming screen using the QIAamp DNA Blood Maxi Kit (QIAGEN) or DNeasy Blood and Tissue Kit (QIAGEN), depending on input cell numbers, following the manufacturer's instructions. The primed PSC proliferation screen was performed in triplicate. Given the substantial technical challenge of the reprogramming discovery screen, the experiment was performed once, with subsequent hits verified by individual gene KOs. gRNA amplification from genomic DNA and Illumina sequencing were performed as described previously (63). For the reprogramming screen, all available genomic DNA was used in the first-round PCR at 1 μg per reaction in 25 reactions to maximize the coverage.

Individual gene targeting
Dual gRNAs were designed using WGE (www.sanger.ac.uk/science/tools/wge) to excise an early exon that would cause a frameshift upon deletion. Sequences used for PCGF3 were TTAGGAGAGCGTCTAGAGCCAGG and GGCACTCACCCCACGTACTGTGG; for PCGF5, GTTCTTCTTCAAAAACTGTTAGG and GACAATCCTATGCTTAGAAATGG; and for FBRS, ATAGGCATCCAGGCCCCATCTGG and GTCACTAAGCAAGTGGAACCAGG. Each gRNA sequence was incorporated into a U6 target gRNA expression vector and synthesized as a gBlock (Integrated DNA Technologies). The gRNA gBlocks were subcloned into pCR2.1-TOPO (Thermo Fisher Scientific) and verified by sequencing. Primed PSCs were dissociated into single cells using Accutase, and 2 million cells were nucleofected with 4 μg of pCas9_GFP (Addgene, #44719) and 3 μg of each gRNA expression vector. After 48 hours, 10,000 green fluorescent protein (GFP)-positive single cells were isolated by flow cytometry and seeded onto MEF in a 10-cm tissue culture dish in primed PSC medium supplemented with 10 μM Y-27632 for the first 24 hours. Individual clones were picked and expanded in 96-well plates and genotyped by PCR.
Mutations were validated by DNA sequencing of TOPO cloned PCR products.

PCGF3 rescue cell line
RNA was extracted using an RNeasy Mini Kit (QIAGEN) and reverse-transcribed using SuperScript II Reverse Transcriptase (Thermo Fisher Scientific). The full-length PCGF3 coding sequence was amplified from the cDNA using HotStarTaq DNA polymerase (QIAGEN) using the primer sequences ACGACGCGTGCCACCATGTTGACCAGGAAGATCAAGCTG and ATAAGAATGCGGCCGCTCACAGCAAGTCCATCTTGGGT. A 729-base pair (bp) PCR product was excised and cloned into a pCAG expression plasmid using Mlu I and Not I enzymes (Thermo Fisher Scientific). The resultant plasmid pCAG-PCGF3-ires-puro was confirmed by DNA sequencing. The plasmid was transfected into PCGF3-deficient primed PSCs using GeneJuice (Merck-Millipore). Cells were treated with puromycin (1 μg/ml) for 48 hours, and resistant colonies were expanded. The expression of PCGF3 in the rescue cells was confirmed by Western blot. The line was selected on the basis of having the closest expression levels to WT cells, although PCGF3 levels in the rescue cells are higher than in the WT cells.

Characterization of cell lines
The use of human embryonic stem cells (ESCs) was carried out in accordance with approvals from the UK Stem Cell Bank Steering Committee. In this study, we used two independently derived PCGF3 KO clonal lines, one clonal PCGF3 heterozygous cell line, one PCGF3 rescue cell line, one FBRS KO clonal line, one PRDM14-AID-VENUS cell line [all WA09/H9 Embryonic Stem Cells (ESCs)], and the FSPS13B CAGCAS9 B1 cell line (derived from FSPS13B iPSCs). Parental WT cells described in the study correspond to untargeted parental WA09/H9 cells.
All cell lines used in this study were authenticated and confirmed to be mycoplasma negative. Pluripotent cell state and undifferentiated status were validated by protein marker expression. PCGF3 KO, FBRS KO, and PRDM14-AID-VENUS primed cells expressed OCT4 and NANOG and were >~95% SSEA4 positive (figs. S5, G and I; S6, B and C; and S8, E and F). PRDM14-AID-VENUS naïve cells expressed NANOG, OCT4, and KLF4 (fig. S8D). Karyotype analysis of G-banded chromosomes (carried out by Cell Guidance Systems) confirmed that all cell lines have the normal complement of chromosomes, and representative karyotypes are shown in fig. S9A. Twenty cells were analyzed per cell line. For the FSPS13B CAGCAS9 B1 cell line, we examined the karyotype status of the cells that were used in the CRISPR screen by implementing a method called eSNP-Karyotyping (64). This approach enabled us to use the RNA-seq data collected from the samples at the time of the experiment to determine chromosome copy number. By adopting this approach, we examined the starting primed cells and the two cell populations that were flow-sorted for the CRISPR screen. The results show that all three samples have a normal chromosome copy number (fig. S9B).

Flow cytometry
Cells were dissociated using Accutase, washed, and passed through 50-μm cell strainers (VWR). Conjugated antibodies and Fixable Viability Dye-eF780 (eBioscience) were mixed with 50 μl of Brilliant stain buffer (BD Biosciences) and applied to 50 μl of cells (500,000 cells per reaction). Cells were incubated for 30 min at 4°C in the dark and washed twice with 2% fetal bovine serum (FBS) in PBS and centrifuged at 300g for 5 min. Cells were resuspended in 2% FBS in PBS and analyzed at the Babraham Institute Flow Core with a BD LSRFortessa cell analyzer (BD Biosciences) or a BD FACSAria Fusion for cell sorting. Single-stained cells or OneComp eBeads (eBioscience) were used for compensation calculations.

Immunofluorescent microscopy
Cells were fixed for 15 min in 4% paraformaldehyde at room temperature and incubated in blocking and permeabilization solution (5% FBS and 0.1% Triton X-100 in PBS) for 1 hour at room temperature or overnight at 4°C. Cells were then incubated overnight in primary antibody at 4°C in blocking and permeabilization solution. After washing three times with blocking and permeabilization solution, cells were incubated with appropriate fluorescently conjugated secondary antibodies for 1 hour at room temperature. Cells were washed three times with PBS, with 4′,6-diamidino-2-phenylindole (0.5 μg/ml; DAPI) to stain DNA included in the second wash. Cells were imaged on either a NIKON A1-R confocal microscope with a 20× oil objective or a Zeiss Axio Observer with the Apotome 3 for structure illuminated optical sectioning, and Z stack images were processed with ImageJ. Antibody details are as follows. Donkey anti-rabbit IgG (H+L) Highly Cross-Adsorbed Secondary Antibody CF-568 (Sigma-Aldrich, SAB4600076) and Donkey anti-mouse IgG (H+L) Highly Cross-Adsorbed Secondary Antibody Alexa Fluor Plus 488 (Invitrogen, A32766) were used at 1:800.

RNA sequencing
RNA was extracted in TRIzol reagent (Thermo Fisher Scientific). Indexed libraries were constructed from 500 ng of total RNA using the NEBNext Ultra RNA Library Prep Kit for Illumina with the Poly(A) mRNA Magnetic Isolation Module (NEB). Library fragment size and concentration were determined using an Agilent Bioanalyzer 2100 and KAPA Library Quantification Kit (KAPA Biosystems). Samples were sequenced on an Illumina NextSeq 500 instrument as 75-bp single-end libraries at the Babraham Institute Sequencing Facility.

Quantitative, multiplexed rapid immunoprecipitation mass spectrometry of endogenous protein
Following established protocols (42,43), cells were dissociated with Accutase, washed with PBS, and cross-linked with 2 mM di(N-succinimidyl) glutarate (Sigma-Aldrich) in PBS for 45 min at room temperature with shaking and then with 1% methanol-free paraformaldehyde (Agar Scientific) for 12.5 min at room temperature with shaking. Fixation was quenched with 0.125 M glycine for 5 min at room temperature with shaking. Cells were washed with PBS containing cOmplete EDTA-free protease inhibitors (Roche), and MEFs were depleted using MACS columns, as described above. Twenty million PSCs per RIME were resuspended in 10 ml of nuclei extraction buffer [10 mM Hepes (pH 7.5), 10 mM EDTA (pH 8.0), 0.5 mM EGTA, and 0.75% Triton X-100] and incubated for 10 min at 4°C with rotation. After centrifugation, nuclei were resuspended in 10 ml of nuclei wash buffer [10 mM Hepes (pH 7.5), 200 mM NaCl, 1 mM EDTA (pH 8.0), and 0.5 mM EGTA] and incubated for 10 min at 4°C with rotation. Cells were pelleted and resuspended in 600 μl of lysis and sonication buffer [25 mM tris (pH 7.5), 150 mM NaCl, 5 mM EDTA (pH 8.0), 0.1% Triton X-100, 1% SDS, and 0.5% sodium deoxycholate], transferred to Protein LoBind tubes, and incubated for 30 min at 4°C to allow cell lysis. Chromatin was fragmented by sonication on a Microson ultrasonic cell disruptor XL Misonix wand sonicator with output setting of 10 W using 30, 32, or 34 sonication cycles (15 s on and 30 s off) for primed, 48-hour 5i/L/A and naïve PSC chromatin, respectively. Triton X-100 was added to the fragmented chromatin to a final 1% before immunoprecipitation.
Magnetic Protein A Dynabeads (Thermo Fisher Scientific), 100 μl per immunoprecipitation, were washed three times with cold 0.5% BSA/PBS, and 5 μg of anti-PCGF3+5 antibody (Abcam, ab201510) was immobilized on the Protein A beads in 250 μl of cold PBS/0.5% BSA for at least 5 hours at 4°C with rotation. The three wash steps with 0.5% BSA/PBS were then repeated, and the fragmented chromatin was added to the beads overnight at 4°C with rotation. The next day, the protein of interest that was bound to the magnetic beads was washed stringently 10 times with RIME radioimmunoprecipitation assay (RIPA) buffer [50 mM Hepes (pH 7.5), 500 mM LiCl, 1 mM EDTA (pH 8.0), 1% NP-40, and 0.7% sodium deoxycholate] and twice with 100 mM ammonium bicarbonate. All visible liquid was removed from the beads, which were then frozen at −80°C and taken to the Cancer Research UK Cambridge Proteomics Facility for on-bead trypsin digestion, sample processing, multiplexing, MS, and preliminary analysis. Samples were prepared as described previously (43). Briefly, after on-bead tryptic digestion, C18 cleaned peptides were labeled with the TMT-16plex reagents (Thermo Fisher Scientific) for 1 hour. Samples were mixed and fractionated with reversed-phase cartridges at high pH (Pierce). Nine fractions were collected using different elution solutions in the range of 5 to 50% ACN (Acetonitrile). Peptide fractions were reconstituted in 0.1% formic acid and analyzed on a Dionex Ultimate 3000 ultrahigh-performance liquid chromatography system coupled with the nano-ESI Fusion Lumos mass spectrometer (Thermo Fisher Scientific). Samples were loaded on the Acclaim PepMap 100, 100 μm by 2 cm, C18, 5 μm, 100-Å trapping column with the μlPickUp injection method using the loading pump at a flow rate of 5 μl/min for 10 min. For the peptide separation, the EASY-Spray analytical column (75 μm by 25 cm, C18, 2 μm, 100 Å) was used for multistep gradient elution at a flow rate of 300 nl/min.
Mobile phase A was composed of 2% acetonitrile and 0.1% formic acid, and mobile phase B was composed of 80% acetonitrile and 0.1% formic acid. Peptides were eluted using a gradient as follows: 0 to 10 min, 5% mobile phase B; 10 to 90 min, 5 to 38% mobile phase B; 90 to 100 min, 38 to 95% B; 100 to 105 min, 95% B; 105 to 110 min, 95 to 5% B; and 110 to 120 min, 5% B. Data-dependent acquisition began with an MS survey scan in the Orbitrap [380 to 1500 mass/charge ratio (m/z), resolution of 120,000 full width at half maximum (FWHM), automatic gain control (AGC) target of 3 × 10⁵, and maximum injection time of 100 ms]. MS2 analysis consisted of collision-induced dissociation (CID), quadrupole ion trap analysis, AGC target of 1 × 10⁴, normalized collision energy of 32, q value of 0.25, maximum injection time of 50 ms, an isolation window at 0.7, and a dynamic exclusion duration of 45 s. MS2-MS3 was conducted using sequential precursor selection methodology with the top10 setting. Higher energy collision dissociation (HCD)-MS3 analysis was performed with MS2 isolation window 2.0 Th. The HCD collision energy was set at 50%, and the detection was performed with Orbitrap resolution of 50,000 FWHM and in the scan range of 100 to 400 m/z. AGC target was 1 × 10⁵, with the maximum injection time of 105 ms.

ChIP sequencing
Cells were harvested, fixed, lysed, and sonicated as described above for qPLEX-RIME except that the MACS fibroblast depletion step was omitted. Fragmented chromatin was centrifuged at 20,000g for 15 min at 4°C, and the 600-μl supernatant was retained and diluted 1:10 with ChIP dilution buffer [150 mM NaCl, 25 mM tris (pH 7.5), 5 mM EDTA, 1% Triton X-100, 0.1% SDS, and 0.5% sodium deoxycholate] to 6-ml total volume and split across six Protein LoBind tubes. Five percent of the diluted chromatin was taken as an input for each condition, and the remaining diluted supernatant was incubated with a total of 10 μg of antibody overnight at 4°C.
Antibodies were PCGF3+5 (Abcam, ab201510) and RING1B (65). Magnetic Protein A or G Dynabeads (Thermo Fisher Scientific), 60 μl per immunoprecipitation, were washed three times with 1 ml of cold wash buffer A [50 mM tris (pH 8), 150 mM NaCl, 0.1% SDS, 0.5% sodium deoxycholate, 1% NP-40, and 1 mM EDTA] and then blocked for 1 hour with 4 μl of yeast tRNA (10 mg/ml) (Thermo Fisher Scientific) and 10 μl of BSA (10 mg/ml) (NEB) in 1 ml of cold wash buffer A at 4°C with rotation. Blocked beads were washed three times with 1 ml of wash buffer A, and the 60-μl initial Protein A/G Dynabeads volume was made up to 200-μl volume with wash buffer A and incubated overnight at 4°C with rotation. Protein A/G beads (33.3 μl) were added to each of the six tubes containing antibody-bound chromatin, which was then incubated for at least 7 hours at 4°C with rotation to immobilize antibody-bound chromatin on the beads. The magnetic beads bound to antibody-chromatin complexes were rinsed once with wash buffer A and then washed twice for 10 min at 4°C with rotation with 1 ml of wash buffer A, and the beads split across six tubes were pooled into a single Protein LoBind tube across these two washes. The beads were washed once with 1 ml of wash buffer B [50 mM tris (pH 8.0), 500 mM NaCl, 0.1% SDS, 0.5% sodium deoxycholate, 1% NP-40, and 1 mM EDTA] and once with 1 ml of wash buffer C [50 mM tris (pH 8.0), 250 mM LiCl, 0.5% sodium deoxycholate, 1% NP-40, and 1 mM EDTA] and rinsed with 1× TE buffer. Chromatin was eluted off the beads in 450 μl of elution buffer (1% SDS and 0.1 M NaHCO₃) containing 11 μl of proteinase K (20 mg/ml), and 5 μl of ribonuclease A (10 mg/ml) (Promega) was added, including to the input samples, and incubated at 37°C for 2 hours, followed by an overnight incubation at 65°C with shaking to reverse cross-link protein from DNA. DNA was purified using AMPure XP beads (Beckman Coulter) into DNA LoBind tubes (Eppendorf) and eluted in 50 μl of buffer EB.
DNA was quantified using a Qubit fluorometer double-stranded DNA high-sensitivity assay kit (Thermo Fisher Scientific), and libraries were prepared using a NEBNext Ultra II DNA library preparation kit for Illumina (NEB) using the manufacturer's protocol, with libraries indexed using NEBNext Multiplex Oligos for Illumina (Index Primers Set 1 and Set 2) (NEB). Following library preparation, library fragment size and concentration were determined using a Qubit fluorometer double-stranded DNA high-sensitivity assay kit, an Agilent Bioanalyzer 2100, and the KAPA Library Quantification Kit (KAPA Biosystems). Samples were sequenced on an Illumina NextSeq 500 instrument as high-output 75-bp single-end reads at the Babraham Institute Next Generation Sequencing Facility. Two independent biological replicates were prepared for each ChIP in each cell type.

Statistical analysis
CRISPR-Cas9 screen: gRNA reads were first counted using an in-house program. Statistical analysis was then performed using MAGeCK (33) by comparing gRNA counts between (i) the day 21 cell population and the starting gRNA library plasmids for the primed proliferation screen and (ii) nascent naïve and refractory cell populations for the reprogramming screen. In the reprogramming screen, the gRNA counts for the two samples are compared to each other and not to the starting gRNA library. It is possible that some gRNAs will be enriched or depleted following expansion and selection; for example, genes essential for the growth of primed PSCs may be lost before reprogramming is initiated. This might eliminate genes that are broadly essential for cell growth, thereby allowing us to focus on identifying genes that are implicated in naïve cell reprogramming. Similarly, the experimental design is unaffected if any gRNAs are enriched in the starting cell population following expansion and selection, as these genes would still be found as "hits" if their relative abundance differed between the reprogrammed and not reprogrammed samples.
Naïve colony-forming assay: A one-sided Student's t test was used to compare nascent naïve colonies with refractory populations after cell sorting.
Flow cytometry: WT, PCGF3 heterozygous, and PCGF3 rescue lines were compared to both of the PCGF3 KO lines using a one-way analysis of variance (ANOVA) with Tukey correction. A two-tailed t test was used to calculate the P value for the difference in the proportion of reprogrammed cells within the population with HDAC inhibitor treatments.
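
The direct comparison of gRNA counts between the two sorted populations can be illustrated with a per-gRNA log2 fold change. This is a simplified sketch and not the MAGeCK algorithm: the depth normalization to counts per million, the pseudocount, the function name, and the toy counts are all assumptions made for illustration.

```python
import math


def log2_fold_changes(naive_counts, refractory_counts, pseudocount=1.0):
    """Per-gRNA log2 ratio of depth-normalized counts (nascent naive / refractory)."""
    naive_total = sum(naive_counts.values())
    refractory_total = sum(refractory_counts.values())
    lfc = {}
    for grna, count in naive_counts.items():
        # Normalize each sample to counts per million to correct for sequencing depth.
        naive_cpm = count / naive_total * 1e6
        refractory_cpm = refractory_counts.get(grna, 0) / refractory_total * 1e6
        lfc[grna] = math.log2((naive_cpm + pseudocount) / (refractory_cpm + pseudocount))
    return lfc


# Hypothetical toy counts, not data from the screen.
naive = {"gRNA_A": 10, "gRNA_B": 500, "gRNA_C": 490}
refractory = {"gRNA_A": 400, "gRNA_B": 300, "gRNA_C": 300}
lfc = log2_fold_changes(naive, refractory)
for grna, value in sorted(lfc.items()):
    print(grna, round(value, 2))
```

In this toy example, a gRNA with a negative value is depleted from the nascent naïve population relative to the refractory population, which is the pattern expected for a gene required for reprogramming.
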

RNA-seq analysis
RNA-seq reads were trimmed using trim galore v0.6.5 (www.bioinformatics.babraham.ac.uk/projects/trim_galore/) using default parameters to remove the standard Illumina adapter sequence. Reads were mapped to the human GRCh38 genome assembly using HISAT 2.1.0 guided by the gene models from the Ensembl v70 release. SAMtools was used to convert to BAM files that were imported to SeqMonk v45.0 (www.bioinformatics.babraham.ac.uk/projects/seqmonk/). Raw read counts per transcript were calculated using the RNA-seq quantitation pipeline on the Ensembl v70 gene set using directional counts. Data were analyzed using DESeq and principal components analysis implemented in SeqMonk. Differentially expressed genes were identified using DESeq2, using a Wald test and Benjamini-Hochberg correction with a false discovery rate (FDR) of <0.05. P values for comparison of fold change in expression of PRC1.3 ChIP targets were calculated using a Kruskal-Wallis test with Dunn's multiple comparisons test. Published RNA-seq datasets (10,32,47,66–69) were analyzed in R or SeqMonk.
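
The Benjamini-Hochberg correction applied to the differential expression results can be sketched as follows. This is a generic standalone implementation of the procedure for illustration, not the DESeq2 code, and the P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


p_vals = [0.001, 0.008, 0.039, 0.041, 0.60]  # hypothetical per-gene P values
q_vals = benjamini_hochberg(p_vals)
significant = [q < 0.05 for q in q_vals]     # FDR threshold used in the text
print(q_vals, significant)
```

Only the first two hypothetical genes pass the FDR < 0.05 threshold here, although four of the five raw P values are below 0.05.
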

ChIP-seq analysis
Reads were trimmed using trim galore (www.bioinformatics.babraham.ac.uk/projects/trim_galore/) using default parameters to remove the standard Illumina adapter sequence and mapped to the human genome GRCh38 using Bowtie2. BAM files were imported to SeqMonk (www.bioinformatics.babraham.ac.uk/projects/seqmonk/), and reads were extended by 200 bp at their 5′ end to approximate the true insert size. Regions of coverage outliers were excluded. To identify PRC1.3 target promoters, 4-kb probes were generated centered on annotated transcriptional start sites. Nonduplicated reads were quantified and corrected per million mapped reads. Probes with log₂ RPM (Reads Per Million) > 3 in the RING1B ChIP samples and log₂ RPM > 1 in the PCGF3 ChIP samples were retained. Quantitation values were matched normalized across the RING1B ChIP datasets and were percentile normalized (75%) across the PCGF3 ChIP datasets. Probes in the PCGF3 ChIP samples were retained if they had a P value of <0.05 (after multiple testing correction) based on a Limma test between WT and PCGF3 KO samples. Retained RING1B and PCGF3 probes were overlapped to identify regions of PRC1.3 occupancy. Probes were name-matched to genes and deduplicated by name. ChIP-seq peaks were called using a MACS implementation in SeqMonk. Each ChIP replicate was analyzed separately using parameters q < 10⁻¹² for RING1B and q < 10⁻⁵ for PCGF3 and with sonicated fragment size of 300. Peaks identified in both replicates were retained and were filtered by signal intensity, retaining only peaks that overlap with at least one 500-bp window in which log₂ RPM > 0. PCGF3 peaks were removed if they overlapped with peaks in PCGF3 KO samples (n = 20 peaks in primed cells and n = 4 peaks in naïve cells). Retained RING1B and PCGF3 peaks were overlapped to identify regions of PRC1.3 occupancy.
Matched regions with RING1B occupancy but not PCGF3 occupancy were identified by filtering retained RING1B peaks that were further than 10 kb from a PCGF3 peak. The control peaks were filtered to select a set of regions with a similar distribution of RING1B ChIP quantified values compared to the PRC1.3 peak set. PRDM14 ChIP signal was compared between PCGF3- and RING1B-bound sites versus matched RING1B-only sites using a two-sided Mann-Whitney test.
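
The read-depth correction and the promoter probe thresholds described above can be sketched as follows. This is a minimal illustration that covers only the log2 RPM cutoffs (>3 for RING1B and >1 for PCGF3); it omits the normalization and Limma steps, and the function names and read numbers are hypothetical.

```python
import math


def log2_rpm(read_count, total_mapped_reads):
    """log2 of reads per million mapped reads for one probe."""
    rpm = read_count / total_mapped_reads * 1e6
    return math.log2(rpm) if rpm > 0 else float("-inf")


def passes_occupancy_filter(ring1b_reads, ring1b_total, pcgf3_reads, pcgf3_total):
    """Apply the thresholds from the text: RING1B log2 RPM > 3 and PCGF3 log2 RPM > 1."""
    return (log2_rpm(ring1b_reads, ring1b_total) > 3
            and log2_rpm(pcgf3_reads, pcgf3_total) > 1)


# Hypothetical probe: 200 RING1B reads and 50 PCGF3 reads, 20 million mapped reads each.
print(passes_occupancy_filter(200, 20_000_000, 50, 20_000_000))
```
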
Heatmaps were generated as follows. BAM files were sorted and indexed using SAMtools. Sorted indexed BAMs were converted to bigWig format using deepTools bamCoverage with the following parameters: --binSize 10 --normalizeUsing RPGC --effectiveGenomeSize 2913022398 --extendReads 200 --ignoreForNormalization ChrM ChrY --minMappingQuality 20. To generate coverage tracks for analysis over regions of interest, input files were subtracted from samples using bigwigCompare with the following parameter: --operation subtract. For visualization, input-subtracted replicates were merged using bigwigCompare with the following parameter: --operation mean. To generate aligned probe plots, first, computeMatrix was run on each input-normalized coverage file using the following parameters: --upstream 10000 --downstream 10000 --missingDataAsZero --skipZeros. Plots were then generated using plotHeatmap with interpolationMethod set to bilinear, and samples were sorted by PCGF3 signal.
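
The coverage-track steps can be assembled as command lines in the following sketch, which only builds and prints the commands and does not run them. The parameter values are those given above; the file names and helper function names are hypothetical, and the input and output options (-b, -o, -b1, -b2) are the standard deepTools ones rather than details taken from the methods.

```python
import shlex


def bam_coverage_cmd(bam, bigwig):
    """bamCoverage call with the normalization parameters listed in the text."""
    return ["bamCoverage", "-b", bam, "-o", bigwig,
            "--binSize", "10",
            "--normalizeUsing", "RPGC",
            "--effectiveGenomeSize", "2913022398",
            "--extendReads", "200",
            "--ignoreForNormalization", "ChrM", "ChrY",
            "--minMappingQuality", "20"]


def bigwig_compare_cmd(first, second, out, operation):
    """bigwigCompare call; operation is 'subtract' (input removal) or 'mean' (replicate merge)."""
    return ["bigwigCompare", "-b1", first, "-b2", second,
            "--operation", operation, "-o", out]


# Hypothetical file names for one ChIP replicate and its input.
commands = [
    bam_coverage_cmd("chip_rep1.sorted.bam", "chip_rep1.bw"),
    bam_coverage_cmd("input_rep1.sorted.bam", "input_rep1.bw"),
    bigwig_compare_cmd("chip_rep1.bw", "input_rep1.bw", "chip_rep1.sub.bw", "subtract"),
    bigwig_compare_cmd("chip_rep1.sub.bw", "chip_rep2.sub.bw", "chip_merged.bw", "mean"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list could be passed to subprocess.run once deepTools is installed; keeping the arguments as lists avoids shell quoting problems with file names.
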

Analysis of protein compartmentalization
Protein localization information was retrieved from the COMPARTMENTS database from the Jensen Laboratory in JSON format using a REST API client (70). Within these files, confidence in the location of individual proteins is assigned a score based on experimental validation, knowledge, or various computational prediction methods. For proteins annotated within multiple subcellular compartments, the highest score was used. Plots were produced in R using custom scripts. Statistical significance of enrichment over a "background" gene set (all genes in the screening library) was assessed using a hypergeometric test. Adjusted P values were calculated using Fisher's exact test.
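The two computational steps in this analysis, selecting the highest-scoring compartment for each protein and testing enrichment against the library background, can be sketched in Python as follows. This is an illustration rather than the R code used in the study: the function names, compartment scores, and gene counts are hypothetical.

```python
from math import comb


def best_compartment(scores):
    """Return the (compartment, score) pair with the highest confidence score."""
    return max(scores.items(), key=lambda item: item[1])


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric draw without replacement.

    population: genes in the background (screening library)
    successes:  background genes annotated to the compartment
    draws:      genes in the hit list
    k:          hit-list genes annotated to the compartment
    """
    first = max(k, 0)
    last = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(first, last + 1)
    )
    return favourable / comb(population, draws)


# Hypothetical scores and counts, for illustration only.
print(best_compartment({"Nucleus": 4.5, "Cytosol": 2.0, "Mitochondrion": 0.7}))
print(hypergeom_upper_tail(k=12, population=1000, successes=50, draws=40))
```

Exact integer binomial coefficients keep the tail probability accurate even when it is very small.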

Gene ontology analysis
Gene ontology (GO) information was accessed through the R package enrichR (71) for the following GO terms: GO_Cellular_Component_2018, GO_Molecular_Function_2018, and GO_Biological_Process_2018. GO terms were ranked by significance, and −log10 adjusted P values were plotted using R. Adjusted P values were calculated using Fisher's exact test.

Motif analysis
Motif analysis was performed using AME (72) on the central 500-bp region within naïve PRC1.3 peaks. Regions matched for GC content were generated using custom Python scripts. The motif database used was HOCOMOCO v11 core human mono meme format. FIMO (73) was used to identify the number of PRDM14 motifs in PRC1.3 peaks versus matched RING1B-only peaks or versus regions matched for GC content. Over half of all naïve PRC1.3 peaks (trimmed to the central 500 bp) contained at least one PRDM14 motif (53%; 201 of 378) compared to ~6% in the RING1B-only peak set (6.4%; 19 of 295) and to ~5% in the GC-matched set (5.3%; 20 of 378). These values were compared using a two-sided Fisher's exact test. The significance of the enrichment of PRDM14 motifs in naïve PRC1.3 peaks compared to a control subset was assessed using a one-sided Fisher's exact test with Bonferroni correction.
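As a worked illustration of the comparison above, the following Python sketch computes one-sided and two-sided Fisher's exact P values from a 2 x 2 table using exact arithmetic, applied to the reported counts for PRC1.3 peaks (201 of 378) and RING1B-only peaks (19 of 295). The function names are hypothetical, and the number of tests used for the Bonferroni correction (two control sets) is an assumption, since the text does not state it.

```python
from fractions import Fraction
from math import comb


def fisher_exact(with_a, without_a, with_b, without_b):
    """Return (one_sided_greater, two_sided) P values for a 2 x 2 table.

    Rows are the two peak sets; columns are peaks with / without a motif.
    The one-sided test asks whether set A has more motif-containing peaks
    than expected; the two-sided test sums all tables that are no more
    probable than the observed one.
    """
    total = with_a + without_a + with_b + without_b
    motif_total = with_a + with_b
    size_a = with_a + without_a
    denom = comb(total, size_a)

    def pmf(x):
        return Fraction(comb(motif_total, x) * comb(total - motif_total, size_a - x), denom)

    low = max(0, size_a - (total - motif_total))
    high = min(motif_total, size_a)
    observed = pmf(with_a)
    greater = sum((pmf(x) for x in range(with_a, high + 1)), Fraction(0))
    two_sided = sum((pmf(x) for x in range(low, high + 1) if pmf(x) <= observed), Fraction(0))
    return float(greater), float(two_sided)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Counts reported in the text: PRC1.3 peaks versus RING1B-only peaks.
p_greater, p_two_sided = fisher_exact(201, 378 - 201, 19, 295 - 19)
print(f"{201 / 378:.1%} vs {19 / 295:.1%}")
print(p_greater, p_two_sided)
print(bonferroni(p_greater, n_tests=2))  # n_tests=2 is an assumption
```

Using `Fraction` avoids floating-point underflow while the tail is summed, so the final conversion to a float happens only once.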

Gene set enrichment analysis
Gene set enrichment analysis (GSEA) was performed using the GSEAPreranked tool within GSEA software (www.broadinstitute.org/gsea). Input data were a ranked gene list ordered by fold change expression between IAA-treated and untreated PRDM14-VENUS-AID PSCs [n = 35721; (47)] and a set of PRC1.3 targets (n = 267; defined by high RING1B and PCGF3 promoter-localized ChIP-seq values in PSCs). Default settings were used with 1000 gene set permutations. The positive enrichment score is calculated using a Kolmogorov-Smirnov test with FDR < 0.001.

qPLEX-RIME analysis
Proteome Discoverer 2.1 (Thermo Fisher Scientific) was used for the processing of CID tandem mass spectra. The SequestHT search engine was used, and all the spectra were searched against the UniProt Homo sapiens FASTA database (taxon ID 9606, version January 2020). All searches were performed using a static modification TMTpro 16plex (+304.207 Da) at any N terminus and on lysine. Methionine oxidation (+15.9949 Da) and deamidation on asparagine and glutamine (+0.984 Da) were included as dynamic modifications. Mass spectra were searched using precursor ion tolerance of 20 parts per million and fragment ion tolerance of 0.5 Da. For peptide confidence, 1% FDR was applied, and peptides uniquely matched to a protein were used for quantification.
Data processing, normalization, and statistical analysis were carried out using the qPLEXanalyzer (43) package from Bioconductor. Peptide intensities were normalized using median scaling, and protein level quantification was obtained by the summation of the normalized peptide intensities. A statistical analysis of differentially regulated proteins was carried out using the Limma method. Multiple testing correction of P values was applied using the Benjamini-Hochberg method to control the FDR.
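The normalization, summation, and multiple testing steps named above can be illustrated with a short Python sketch. It is not the qPLEXanalyzer implementation: the function names and toy intensities are hypothetical, and the particular form of median scaling shown here (rescaling each sample so that its median equals the mean of all sample medians) is an assumption made for the example.

```python
from collections import defaultdict
from statistics import median


def median_scale(samples):
    """Rescale each sample so that all samples share the same median.

    samples maps a sample name to its list of peptide intensities.
    """
    medians = {name: median(values) for name, values in samples.items()}
    target = sum(medians.values()) / len(medians)
    return {
        name: [value * target / medians[name] for value in values]
        for name, values in samples.items()
    }


def sum_to_protein(proteins, intensities):
    """Sum peptide intensities per protein (one protein label per peptide)."""
    totals = defaultdict(float)
    for protein, intensity in zip(proteins, intensities):
        totals[protein] += intensity
    return dict(totals)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical toy intensities and P values, for illustration only.
scaled = median_scale({"s1": [1.0, 2.0, 3.0], "s2": [2.0, 4.0, 6.0]})
print(scaled)
print(sum_to_protein(["P1", "P1", "P2"], scaled["s1"]))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The downward pass in `benjamini_hochberg` enforces monotonicity, so an adjusted value never exceeds that of a peptide or protein with a larger raw P value.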

SUPPLEMENTARY MATERIALS
Supplementary material for this article is available at https://science.org/doi/10.1126/sciadv.abk0013